Search CORE

21 research outputs found

Query Load Balancing by Caching Search Results in Peer-to-Peer Information Retrieval Networks

Author: Hiemstra Djoerd
Tigelaar Almer S.
Publication venue: Amsterdam University Press
Publication date: 01/01/2011
Field of study

For peer-to-peer web search engines it is important to keep the delay between receiving a query and providing search results within an acceptable range for the end user. How to achieve this remains an open challenge. One way to reduce delays is by caching search results for queries and allowing peers to access each others cache. In this paper we explore the limitations of search result caching in large-scale peer-to-peer information retrieval networks by simulating such networks with increasing levels of realism. We find that cache hit ratios of at least thirty-three percent are attainable

CiteSeerX

University of Twente Research Information

Query-Based Sampling using Only Snippets

Author: Hiemstra Djoerd
Tigelaar Almer S.
Publication venue: Centre for Telematics and Information Technology, University of Twente
Publication date: 01/01/2009
Field of study

Query-based sampling is a popular approach to model the content of an uncooperative server. It works by sending queries to the server and downloading the returned documents in the search results in full. This sample of documents then represents the server’s content. We present an approach that uses the document snippets as samples instead of downloading entire documents. This yields more stable results at the same amount of bandwidth usage as the full document approach. Additionally, we show that using snippets does not necessarily incur more latency, but can actually save time

CiteSeerX

Radboud Repository

University of Twente Research Information

Query-Based Sampling using Snippets

Author: Hiemstra D.
Tigelaar Almer S.
Publication venue: ACM
Publication date: 01/01/2010
Field of study

Query-based sampling is a commonly used approach to model the content of servers. Conventionally, queries are sent to a server and the documents in the search results returned are downloaded in full as representation of the server’s content. We present an approach that uses the document snippets in the search results as samples instead of downloading the entire documents. We show this yields equal or better modeling performance for the same bandwidth consumption depending on collection characteristics, like document length distribution and homogeneity. Query-based sampling using snippets is a useful approach for real-world systems, since it requires no extra operations beyond exchanging queries and search results

Radboud Repository

University of Twente Research Information

Peer-to-peer information retrieval

Author: Tigelaar Almer S.
Publication venue: University of Twente, Centre for Telematics and Information Technology (CTIT)
Publication date: 01/01/2012
Field of study

The Internet has become an integral part of our daily lives. However,the essential task of finding information is dominated by a handful of large centralised search engines. In this thesis we study an alternative to this approach. Instead of using large data centres, we propose using the machines that we all use every day: our desktop, laptop and tablet computers, to build a peer-to-peer web search engine. We provide a definition of the associated research field: peer-to-peer information retrieval. We examine what separates it from related fields, give an overview of the work done so far and provide an economic perspective on peer-to-peer search. Furthermore, we introduce our own architecture for peer-to-peer search systems, inspired by BitTorrent. Distributing the task of providing search results for queries introduces the problem of query routing: a query needs to be sent to a peer that can provide relevant search results. We investigate how the content of peers can be represented so that queries can be directed to the best ones in terms of relevance. While cooperative peers can provide their own representation, the content of uncooperative peers can be accessed only through a search interface and thus they can not actively provide a description of themselves. We look into representing these uncooperative peers by probing their search interface to construct a representation. Finally, the capacity of the machines in peer-to-peer networks differs considerably, making it challenging to provide search results quickly. To address this, we present an approach where copies of search results for previous queries are retained at peers and used to serve future requests and show participation can be incentivised using reputations. There are still problems to be solved before a real-world peer-to-peer web search engine can be built. This thesis provides a starting point for this ambitious goal and also provides a solid basis for reasoning about peer-to-peer information retrieval systems in general

University of Twente Research Information

Peer to Peer Information Retrieval: An Overview

Author: Hiemstra Djoerd
Tigelaar Almer S.
Trieschnigg Dolf
Publication venue: ACM
Publication date: 01/01/2012
Field of study

Peer-to-peer technology is widely used for file sharing. In the past decade a number of prototype peer-to-peer information retrieval systems have been developed. Unfortunately, none of these have seen widespread real- world adoption and thus, in contrast with file sharing, information retrieval is still dominated by centralised solutions. In this paper we provide an overview of the key challenges for peer-to-peer information retrieval and the work done so far. We want to stimulate and inspire further research to overcome these challenges. This will open the door to the development and large-scale deployment of real-world peer-to-peer information retrieval systems that rival existing centralised client-server solutions in terms of scalability, performance, user satisfaction and freedom

Radboud Repository

University of Twente Research Information

Matching Queries to Frequently Asked Questions: Search Functionality for the MRSA Web-Portal

Author: Akker Rieks op den
Tigelaar Almer S.
Verhoeven Fenne
Publication venue: Centre for Telematics and Information Technology, University of Twente
Publication date: 01/01/2009
Field of study

As part of the long-term EUREGIO MRSA-net project a system was developed which enables health care workers and the general public to quickly find answers to their questions regarding the MRSA pathogen. This paper focuses on how these questions can be answered using Information Retrieval (IR) and Natural Language Processing (NLP) techniques on a Frequently-Asked-Questions-style (FAQ) database

University of Twente Research Information

DEIRA: A Dynamic Engaging Intelligent Reporter Agent (Demo Paper)

Author: Alofs Thijs
Knoppel François L.A.
Oude Bos Danny
Tigelaar Almer S.
Publication venue: Twente University Press (TUP)
Publication date: 01/01/2008
Field of study

University of Twente Research Information